Submitted by Kimberly Peters
● The dataset that we will be wrangling (and analyzing and visualizing) is the tweet archive of Twitter user @dog_rates, also known as WeRateDogs. WeRateDogs is a Twitter account that rates people's dogs with a humorous comment about the dog. These ratings almost always have a denominator of 10. The numerators, though? Almost always greater than 10. 11/10, 12/10, 13/10, etc. Why? Because "they're good dogs Brent." WeRateDogs has over 4 million followers and has received international media coverage.
● WeRateDogs downloaded their Twitter archive and sent it to Udacity to use in this project. This archive contains basic tweet data (tweet ID, timestamp, text, etc.) for all 5000+ of their tweets as they stood on August 1, 2017.
● Main tasks in this project are as follows:
● Gathering Data
● Assessing Data
● Cleaning Data
● Storing, analyzing, and visualizing the wrangled data
● Reporting on: a) the data wrangling efforts, and b) the data analyses and visualizations
● Tweet sharing is one of the main drivers of growth for a Twitter account.
● We therefore use the data at our disposal to better understand what influences tweet sharing, looking at the kind of dog (breed, stage) and the rating, in order to give the marketing team some insight.
● Before starting the analysis, recall one general statistic: the cleaned master dataframe has 1976 observations.
Chart 1: the most popular dog breeds in the dataset
Chart 2: the most popular dog stages in the dataset
Chart 3: the relationship between dog breed and average favourite count
● According to chart 1, the golden retriever is the most frequently rated dog breed.
● According to chart 2, puppers are the most frequently rated dog stage and floofers the least.
● According to chart 3, Afghan_hound has the highest average favourite count.
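The three aggregations behind these charts can be sketched roughly as follows. This is a minimal illustration that uses only the Python standard library rather than the master dataframe itself. The field names (`breed`, `stage`, `favorite_count`) and the sample rows are assumptions made up for the example, not the project's actual columns or data.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical sample rows: field names and values are invented for
# illustration and are NOT taken from the real WeRateDogs dataset.
tweets = [
    {"breed": "golden_retriever", "stage": "pupper", "favorite_count": 12000},
    {"breed": "golden_retriever", "stage": "doggo", "favorite_count": 15000},
    {"breed": "Afghan_hound", "stage": "pupper", "favorite_count": 17000},
    {"breed": "pug", "stage": "floofer", "favorite_count": 6000},
    {"breed": "pug", "stage": "pupper", "favorite_count": 5000},
    {"breed": "golden_retriever", "stage": None, "favorite_count": 9000},
]

# Chart 1: number of rated tweets per breed.
breed_counts = Counter(t["breed"] for t in tweets)

# Chart 2: number of rated tweets per dog stage, skipping tweets with no stage.
stage_counts = Counter(t["stage"] for t in tweets if t["stage"])

# Chart 3: mean favourite count per breed.
favourites_by_breed = defaultdict(list)
for t in tweets:
    favourites_by_breed[t["breed"]].append(t["favorite_count"])
avg_favourites = {b: mean(counts) for b, counts in favourites_by_breed.items()}

# Headline figures corresponding to the three conclusions.
top_breed = breed_counts.most_common(1)[0][0]
top_stage = stage_counts.most_common(1)[0][0]
top_avg_breed = max(avg_favourites, key=avg_favourites.get)

print(top_breed, top_stage, top_avg_breed)
```

Note that a breed with very few tweets can top the average favourite count on the strength of a single popular tweet, so the count behind each average is worth checking before drawing conclusions from chart 3.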